Results 1 - 20 of 100
1.
IEEE Trans Med Imaging; PP, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587957

ABSTRACT

Accurate retinal layer segmentation on optical coherence tomography (OCT) images is hampered by the difficulty of collecting OCT images with diverse pathological characteristics and a balanced distribution. Current generative models can produce highly realistic images and corresponding labels in unlimited quantities by fitting the distribution of real collected data. Nevertheless, the diversity of their generated data is still limited due to the inherent imbalance of the training data. To address these issues, we propose an image-label pair generation framework that generates diverse and balanced potential data from imbalanced real samples. Specifically, the framework first generates diverse layer masks and then generates plausible OCT images corresponding to these layer masks, using two customized diffusion probabilistic models, respectively. To learn from imbalanced data and facilitate balanced generation, we introduce pathology-related conditions to guide the generation processes. To enhance the diversity of the generated image-label pairs, we propose a potential structure modeling technique that transfers the knowledge of diverse sub-structures from mildly pathological or non-pathological samples to highly pathological samples. We conducted extensive experiments on two public datasets for retinal layer segmentation. First, our method generates OCT images with higher image quality and diversity than other generative methods. Furthermore, training with the generated OCT images improves results on downstream retinal layer segmentation tasks. The code is publicly available at: https://github.com/nicetomeetu21/GenPSM.
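As a rough illustration of the diffusion-based generation described above, the sketch below shows a standard conditional DDPM training step in PyTorch. The `denoiser` interface, the `condition` argument (e.g., a pathology label or a layer mask) and all names are assumptions for illustration, not the authors' GenPSM code.

```python
# Minimal conditional DDPM training step (hypothetical, not the GenPSM implementation).
import torch
import torch.nn.functional as F

def ddpm_training_step(denoiser, x0, condition, alphas_cumprod):
    """Sample a random timestep, noise the clean input x0, and train the
    denoiser to predict that noise given the conditioning signal."""
    b = x0.size(0)
    t = torch.randint(0, alphas_cumprod.numel(), (b,), device=x0.device)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    # Forward diffusion: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * noise
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
    pred_noise = denoiser(x_t, t, condition)  # condition: pathology label or layer mask
    return F.mse_loss(pred_noise, noise)
```

In a two-stage framework like the one described, one such model might be trained with layer masks as `x0` and pathology labels as `condition`, and a second with OCT images as `x0` and masks as `condition`.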

2.
Article in English | MEDLINE | ID: mdl-38530724

ABSTRACT

Disentanglement learning aims to separate the explanatory factors of variation so that different attributes of the data can be well characterized and isolated, which promotes efficient inference for downstream tasks. Mainstream disentanglement approaches based on generative adversarial networks (GANs) learn interpretable data representations. However, most typical GAN-based works lack a discussion of the latent subspace, leading to insufficient consideration of the variation of independent factors. Although some recent research analyzes the latent space of pretrained GANs for image editing, it does not emphasize learning representations directly from the subspace perspective. Appropriate subspace properties could facilitate feature representation learning that satisfies the independent-variation requirements of the obtained explanatory factors, which is crucial for better disentanglement. In this work, we propose a unified framework for ensuring disentanglement that fully investigates latent subspace learning (SL) in GANs. The novel GAN-based architecture explores orthogonal subspace representation (OSR) on a vanilla GAN and is named OSRGAN. To guide the subspace toward strong correlation, low redundancy, and robust distinguishability, our OSR comprises three stages: self-latent-aware, orthogonal subspace-aware, and structure representation-aware. First, the self-latent-aware stage promotes a latent subspace strongly correlated with the data space to discover interpretable factors, but with poor independence of variation. Second, the orthogonal subspace-aware stage adaptively learns 1-D linear subspaces spanned by a set of orthogonal bases in the latent space; the reduced redundancy between them expresses the corresponding independence. Third, the structure representation-aware stage aligns the projection onto the orthogonal subspace with the latent variables. Accordingly, the feature representation in each linear subspace becomes distinguishable, enhancing the independent expression of interpretable factors. In addition, we design an alternating optimization step that achieves a tradeoff in training OSRGAN on the different properties. Although orthogonality is strictly constrained, the loss weight coefficient of the distinguishability induced by orthogonality can be adjusted and balanced against the correlation constraint. This tradeoff training prevents OSRGAN from overemphasizing any single property and damaging the expressiveness of the feature representation; it takes into account both the interpretable factors and their independent variation characteristics. Meanwhile, alternating optimization keeps the cost and efficiency of forward inference unchanged and does not increase the computational complexity. In theory, we clarify the significance of OSR, which brings better independence of factors along with interpretability, as the correlation can converge to a high range faster. Moreover, through convergence behavior analysis, including the objective functions under different constraints and the evaluation curves over iterations, our model demonstrates enhanced stability and converges toward a higher peak for disentanglement. To assess performance on downstream tasks, we compare with state-of-the-art GAN-based and VAE-based approaches on different datasets. Our OSRGAN achieves higher disentanglement scores on the FactorVAE, SAP, MIG, and VP metrics. All experimental results illustrate that our novel GAN-based framework has considerable advantages for disentanglement.
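A common way to keep a set of basis directions orthogonal, as the orthogonal subspace-aware stage requires, is a Frobenius-norm penalty on the Gram matrix. The sketch below is a generic regularizer of that kind and is not the exact loss used by OSRGAN.

```python
import torch

def orthogonality_penalty(basis: torch.Tensor) -> torch.Tensor:
    """basis: (k, d) matrix whose rows span k 1-D subspaces of the latent space.
    Penalize deviation of the Gram matrix from the identity: ||B B^T - I||_F^2."""
    gram = basis @ basis.t()
    eye = torch.eye(basis.size(0), device=basis.device)
    return ((gram - eye) ** 2).sum()
```

In an alternating scheme such as the one described above, the weight on a term like this could be traded off against the correlation constraint rather than enforced exactly.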

3.
IEEE Trans Med Imaging; PP, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38530715

ABSTRACT

The instrument-tissue interaction detection task, which helps understand surgical activities, is vital for constructing computer-assisted surgery systems but faces many challenges. First, most models represent instrument-tissue interaction in a coarse-grained way that only focuses on classification and lacks the ability to automatically detect instruments and tissues. Second, existing works do not fully consider the intra- and inter-frame relations between instruments and tissues. In this paper, we propose to represent an instrument-tissue interaction as an ⟨instrument class, instrument bounding box, tissue class, tissue bounding box, action class⟩ quintuple and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect the quintuple for surgical video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships of proposals in the current frame using global context information in the video snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model temporal information for the same instance. For evaluation, we build a cataract surgery video (PhacoQ) dataset and a cholecystectomy surgery video (CholecQ) dataset. Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets.
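The quintuple representation itself is straightforward to encode; a minimal container might look like the following sketch (field names and the box convention are assumptions, not the paper's data format).

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixel coordinates

@dataclass
class InteractionQuintuple:
    instrument_class: str
    instrument_box: Box
    tissue_class: str
    tissue_box: Box
    action_class: str

# Example: one detection produced for a single video frame (hypothetical values).
example = InteractionQuintuple("phaco_handpiece", (120.0, 88.0, 260.0, 190.0),
                               "lens", (140.0, 100.0, 300.0, 240.0), "aspirate")
```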

4.
Article in English | MEDLINE | ID: mdl-38356213

ABSTRACT

RGB-D salient object detection (SOD) has gained tremendous attention in recent years. In particular, transformers have been employed and have shown great potential. However, existing transformer models usually overlook the vital edge information, which is a major issue restricting further improvement of SOD accuracy. To this end, we propose a novel edge-aware RGB-D SOD transformer that explicitly models the edge information in a dual-band decomposition framework. Specifically, we employ two parallel decoder networks to learn the high-frequency edge and low-frequency body features from the low- and high-level features extracted from a two-stream multimodal backbone network, respectively. Next, we propose a cross-attention complementarity exploration module to enrich the edge/body features by exploiting the multimodal complementary information. The refined features are then fed into our proposed color-hint guided fusion module for enhancing the depth feature and fusing the multimodal features. Finally, the resulting features are fused using our deeply supervised progressive fusion module, which progressively integrates edge and body features for predicting saliency maps. Our model explicitly considers the edge information for accurate RGB-D SOD, overcoming the limitations of existing methods and effectively improving performance. Extensive experiments on benchmark datasets demonstrate that our model is an effective RGB-D SOD framework that outperforms the current state-of-the-art models, both quantitatively and qualitatively. A further extension to RGB-T SOD demonstrates the promising potential of our model in various kinds of multimodal SOD tasks.

5.
Article in English | MEDLINE | ID: mdl-38381644

ABSTRACT

Super-resolving the magnetic resonance (MR) image of a target contrast under the guidance of a corresponding auxiliary contrast, which provides additional anatomical information, is a new and effective solution for fast MR imaging. However, current multi-contrast super-resolution (SR) methods tend to concatenate the different contrasts directly, ignoring their relationships in different clues, e.g., in the high- and low-intensity regions. In this study, we propose a separable attention network (comprising high-intensity priority (HP) attention and low-intensity separation (LS) attention), named SANet. Our SANet explores the high- and low-intensity regions in the "forward" and "reverse" directions with the help of the auxiliary contrast, while learning clearer anatomical structure and edge information for the SR of a target-contrast MR image. SANet provides three appealing benefits. First, it is the first model to explore a separable attention mechanism that uses the auxiliary contrast to predict the high- and low-intensity regions, diverting more attention to refining uncertain details between these regions and correcting the fine areas in the reconstructed results. Second, a multistage integration module is proposed to learn the responses of multi-contrast fusion at multiple stages, capture the dependencies between the fused representations, and boost their representation ability. Third, extensive experiments with various state-of-the-art multi-contrast SR methods on fastMRI and clinical in vivo datasets demonstrate the superiority of our model. The code is released at https://github.com/chunmeifeng/SANet.
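One plausible reading of the separable attention idea is to derive complementary soft masks for high- and low-intensity regions from the auxiliary contrast and weight the target features with them. The sketch below follows that reading under stated assumptions and is not the released SANet code; `tau` and the mask definition are illustrative choices.

```python
import torch

def separable_intensity_masks(aux: torch.Tensor, tau: float = 10.0):
    """aux: auxiliary-contrast image (B, 1, H, W) normalised to [0, 1].
    Returns soft high- and low-intensity masks that sum to one per pixel."""
    mean = aux.mean(dim=(-2, -1), keepdim=True)
    high = torch.sigmoid(tau * (aux - mean))   # "forward" direction
    return high, 1.0 - high                    # "reverse" direction

def apply_separable_attention(target_feat: torch.Tensor, aux: torch.Tensor):
    """Split target-contrast features into high- and low-intensity streams."""
    high, low = separable_intensity_masks(aux)
    return target_feat * high, target_feat * low
```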

6.
IEEE Trans Med Imaging; PP, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38206778

ABSTRACT

Color fundus photography (CFP) and optical coherence tomography (OCT) are two of the most widely used imaging modalities in the clinical diagnosis and management of retinal diseases. Despite the widespread use of multimodal imaging in clinical practice, few methods for the automated diagnosis of eye diseases effectively utilize the correlated and complementary information from multiple modalities. This paper explores how to leverage the information from CFP and OCT images to improve the automated diagnosis of retinal diseases. We propose a novel multimodal learning method, named geometric correspondence-based multimodal learning network (GeCoM-Net), to achieve the fusion of CFP and OCT images. Specifically, inspired by clinical observations, we consider the geometric correspondence between an OCT slice and the corresponding CFP region to learn the correlated features of the two modalities for robust fusion. Furthermore, we design a new feature selection strategy to extract discriminative OCT representations by automatically selecting the important feature maps from OCT slices. Unlike existing multimodal learning methods, GeCoM-Net is the first method that explicitly formulates the geometric relationships between an OCT slice and the corresponding region of the CFP image for CFP and OCT fusion. Experiments have been conducted on a large-scale private dataset and a publicly available dataset to evaluate the effectiveness of GeCoM-Net for diagnosing diabetic macular edema (DME), impaired visual acuity (VA) and glaucoma. The empirical results show that our method outperforms the current state-of-the-art multimodal learning methods by improving the AUROC score by 0.4%, 1.9% and 2.9% for DME, VA and glaucoma detection, respectively.

7.
Genome Med; 16(1): 12, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38217035

ABSTRACT

Optimal integration of transcriptomics data and associated spatial information is essential for fully exploiting spatial transcriptomics to dissect tissue heterogeneity and map out inter-cellular communications. We present SEDR, which uses a deep autoencoder coupled with a masked self-supervised learning mechanism to construct a low-dimensional latent representation of gene expression, which is then simultaneously embedded with the corresponding spatial information through a variational graph autoencoder. SEDR achieved higher clustering performance on manually annotated 10x Visium datasets and better scalability on high-resolution spatial transcriptomics datasets than existing methods. Additionally, we show SEDR's ability to impute and denoise gene expression (https://github.com/JinmiaoChenLab/SEDR/).


Subjects
Cell Communication, Gene Expression Profiling, Humans, Cluster Analysis
8.
Sci Data; 11(1): 99, 2024 Jan 20.
Article in English | MEDLINE | ID: mdl-38245589

ABSTRACT

Pathologic myopia (PM) is a common blinding retinal degeneration affecting the highly myopic population. Early screening of this condition can reduce the damage caused by the associated fundus lesions and therefore prevent vision loss. Automated diagnostic tools based on artificial intelligence methods can benefit this process by aiding clinicians to identify disease signs or to screen mass populations using color fundus photographs as inputs. This paper provides insights about PALM, our open fundus imaging dataset for pathologic myopia recognition and anatomical structure annotation. Our database comprises 1200 images with associated labels for the pathologic myopia category and manual annotations of the optic disc, the position of the fovea, and delineations of lesions such as patchy retinal atrophy (including peripapillary atrophy) and retinal detachment. In addition, this paper elaborates on other details, such as the labeling process used to construct the database and the quality and characteristics of the samples, and provides other relevant usage notes.


Subjects
Degenerative Myopia, Optic Disc, Retinal Degeneration, Humans, Artificial Intelligence, Fundus Oculi, Degenerative Myopia/diagnostic imaging, Degenerative Myopia/pathology, Optic Disc/diagnostic imaging
9.
IEEE Trans Med Imaging; 43(3): 1237-1246, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37956005

ABSTRACT

Retinal arteriovenous nicking (AVN) manifests as a reduced venular caliber at an arteriovenous crossing. AVNs are signs of many systemic, particularly cardiovascular, diseases. Studies have shown that people with AVN are twice as likely to have a stroke. However, AVN classification faces two challenges. One is the lack of data, especially of AVNs compared with normal arteriovenous (AV) crossings. The other is the significant intra-class variation and minute inter-class differences: AVNs may differ in shape, scale, pose, and color, while an AVN may differ from a normal AV crossing only by a slight thinning of the vein. To address these challenges, first, we develop a data synthesis method to generate AV crossings, including normal crossings and AVNs. Second, to mitigate the domain shift between the synthetic and real data, an edge-guided unsupervised domain adaptation network is designed to guide the transfer of domain-invariant information. Third, a semantic contrastive learning branch (SCLB) is introduced, in which a set of semantically related images, forming a semantic triplet, are input to the network simultaneously to guide the network to focus on the subtle differences in venular width and to ignore differences in appearance. These strategies effectively mitigate the lack of data, the domain shift between synthetic and real data, and the significant intra- but minute inter-class differences. Extensive experiments have been performed to demonstrate the outstanding performance of the proposed method.
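Semantic triplets of the kind used by the SCLB are typically trained with a margin-based triplet loss; the following is a generic version of such a loss (PyTorch also ships `nn.TripletMarginLoss`), not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def triplet_margin_loss(anchor, positive, negative, margin: float = 0.2):
    """anchor/positive share the same venular-width label; negative does not.
    All inputs are (B, D) embedding tensors."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```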


Subjects
Cardiovascular Diseases, Retinal Diseases, Retinal Vein, Humans
10.
Br J Ophthalmol; 108(3): 432-439, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-36596660

ABSTRACT

BACKGROUND: Optical coherence tomography angiography (OCTA) enables fast and non-invasive high-resolution imaging of the retinal microvasculature and has been suggested as a potential tool for the early detection of retinal microvascular changes in Alzheimer's disease (AD). We developed a standardised OCTA analysis framework and compared its extracted parameters among controls and AD/mild cognitive impairment (MCI) subjects in a cross-sectional study. METHODS: We defined and extracted geometrical parameters of the retinal microvasculature at different retinal layers and in the foveal avascular zone (FAZ) from segmented OCTA images obtained using well-validated state-of-the-art deep learning models. We studied these parameters in 158 subjects (62 healthy controls, 55 AD and 41 MCI) using logistic regression to determine their potential in predicting the status of our subjects. RESULTS: In the AD group, there was a significant decrease in vessel area and length densities in the inner vascular complexes (IVC) compared with controls. The number of vascular bifurcations in AD was also significantly lower than in healthy controls. The MCI group demonstrated a decrease in vessel area and length densities, vascular fractal dimension and the number of bifurcations in both the superficial vascular complexes (SVC) and the IVC compared with controls. A larger vascular tortuosity in the IVC and a larger roundness of the FAZ in the SVC were also observed in MCI compared with controls. CONCLUSION: Our study demonstrates the applicability of OCTA for the diagnosis of AD and MCI, and provides a standard tool for future clinical service and research. Biomarkers from retinal OCTA images can provide useful information for clinical decision-making and the diagnosis of AD and MCI.
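For readers who want to reproduce this kind of analysis, the statistical step reduces to a logistic regression on per-subject OCTA parameters. A minimal sketch follows; the file name, column names and grouping are assumptions for illustration only, not the study's data.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("octa_parameters.csv")  # one row per subject (hypothetical file)
features = ["vessel_area_density_ivc", "vessel_length_density_ivc",
            "bifurcation_count_ivc", "fractal_dimension_svc", "faz_roundness_svc"]
X = df[features]
y = (df["group"] == "AD").astype(int)    # e.g. AD vs. healthy control

model = LogisticRegression(max_iter=1000)
print(cross_val_score(model, X, y, cv=5, scoring="roc_auc"))
```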


Subjects
Alzheimer Disease, Cognitive Dysfunction, Humans, Fluorescein Angiography/methods, Retinal Vessels/diagnostic imaging, Optical Coherence Tomography/methods, Alzheimer Disease/diagnostic imaging, Microvessels/diagnostic imaging, Cognitive Dysfunction/diagnostic imaging
11.
Article in English | MEDLINE | ID: mdl-38039172

ABSTRACT

Federated learning (FL) collaboratively trains a shared global model across multiple local clients while keeping the training data decentralized to preserve data privacy. However, standard FL methods ignore the noisy-client issue, which may harm the overall performance of the shared model. We first investigate the critical issue caused by noisy clients in FL and quantify their negative impact in terms of the representations learned by different layers. We make two key observations: 1) noisy clients can severely impact the convergence and performance of the global model in FL, and 2) noisy clients induce greater bias in the deeper layers than in the shallower layers of the global model. Based on these observations, we propose federated noisy client learning (Fed-NCL), a framework that conducts robust FL with noisy clients. Specifically, Fed-NCL first identifies the noisy clients by estimating data quality and model divergence. Robust layerwise aggregation is then proposed to adaptively aggregate the local models of each client to deal with the data heterogeneity caused by the noisy clients. We further perform label correction on the noisy clients to improve the generalization of the global model. Experimental results on various datasets demonstrate that our algorithm boosts the performance of different state-of-the-art systems with noisy clients. Our code is available at https://github.com/TKH666/Fed-NCL.
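A minimal sketch of layerwise aggregation is shown below, assuming each client supplies a model `state_dict` and a per-layer reliability weight; how those weights are estimated from data quality and model divergence is the paper's contribution and is not reproduced here.

```python
import torch

def layerwise_aggregate(client_states, client_layer_weights):
    """client_states: list of state_dicts from the clients.
    client_layer_weights: list of dicts mapping parameter name -> non-negative weight."""
    global_state = {}
    for name in client_states[0]:
        w = torch.tensor([cw[name] for cw in client_layer_weights], dtype=torch.float32)
        w = w / w.sum()                                    # normalise per layer
        stacked = torch.stack([s[name].float() for s in client_states])
        shape = (-1,) + (1,) * (stacked.dim() - 1)
        global_state[name] = (w.view(shape) * stacked).sum(dim=0)
    return global_state
```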

12.
IEEE Trans Med Imaging; PP, 2023 Dec 28.
Article in English | MEDLINE | ID: mdl-38153819

ABSTRACT

Fully supervised learning requires massive amounts of high-quality annotated data, which are difficult to obtain for image segmentation since pixel-level annotation is expensive, especially for medical image segmentation tasks that require domain knowledge. As an alternative, semi-supervised learning (SSL) can effectively alleviate the dependence on annotated samples by leveraging abundant unlabeled samples. Among SSL methods, mean-teacher (MT) is the most popular. However, in MT, the teacher model's weights are completely determined by the student model's weights, which leads to a training bottleneck in the late training stages. Besides, only pixel-wise consistency is applied to the unlabeled data, which ignores category information and is susceptible to noise. In this paper, we propose a bilateral supervision network with a bilateral exponential moving average (bilateral-EMA), named BSNet, to overcome these issues. On the one hand, both the student and teacher models are trained on labeled data, and their weights are then updated with the bilateral-EMA, so the two models can learn from each other. On the other hand, pseudo labels are used to perform bilateral supervision on unlabeled data. Moreover, to enhance this supervision, we adopt adversarial learning to push the network to generate more reliable pseudo labels for unlabeled data. We conduct extensive experiments on three datasets to evaluate the proposed BSNet, and the results show that BSNet improves semi-supervised segmentation performance by a large margin and surpasses other state-of-the-art SSL methods.
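The exponential moving average at the heart of mean-teacher style methods is a one-line update per parameter; a generic version is sketched below. In the bilateral scheme described above, an update of this form would presumably be applied in both directions (teacher from student and student from teacher); the momentum value here is an assumption.

```python
import torch

@torch.no_grad()
def ema_update(target_model, source_model, momentum: float = 0.99):
    """Move each target parameter towards the corresponding source parameter:
    target <- momentum * target + (1 - momentum) * source."""
    for p_t, p_s in zip(target_model.parameters(), source_model.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
```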

13.
IEEE Trans Med Imaging; PP, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38015687

ABSTRACT

Medical imaging provides many valuable clues involving anatomical structure and pathological characteristics. However, image degradation is a common issue in clinical practice, which can adversely impact observation and diagnosis by physicians and algorithms. Although extensive enhancement models have been developed, these models require thorough pre-training before deployment and fail to take advantage of the potential value of inference data after deployment. In this paper, we propose an algorithm for source-free unsupervised domain adaptive medical image enhancement (SAME), which adapts and optimizes enhancement models using test data in the inference phase. A structure-preserving enhancement network is first constructed to learn a robust source model from synthesized training data. Then a teacher-student model is initialized with the source model and conducts source-free unsupervised domain adaptation (SFUDA) by knowledge distillation with the test data. Additionally, a pseudo-label picker is developed to boost the knowledge distillation of enhancement tasks. Experiments were conducted on ten datasets from three medical image modalities to validate the advantage of the proposed algorithm, and setting analyses and ablation studies were also carried out to interpret the effectiveness of SAME. The remarkable enhancement performance and benefits for downstream tasks demonstrate the potential and generalizability of SAME. The code is available at https://github.com/liamheng/Annotation-free-Medical-Image-Enhancement.

14.
Nat Commun; 14(1): 6757, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37875484

ABSTRACT

Failure to recognize samples from classes unseen during training is a major limitation of artificial intelligence in real-world implementations of retinal anomaly recognition and classification. We establish an uncertainty-inspired open set (UIOS) model, which is trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calculates an uncertainty score to express its confidence. Our UIOS model with a thresholding strategy achieves F1 scores of 99.55%, 97.01% and 91.91% on the internal testing set, the external target categories (TC)-JSIEC dataset and the TC-unseen testing set, respectively, compared to F1 scores of 92.20%, 80.69% and 64.74% by the standard AI model. Furthermore, UIOS correctly predicts high uncertainty scores, which prompt the need for a manual check, on datasets of non-target-category retinal diseases, low-quality fundus images, and non-fundus images. UIOS provides a robust method for real-world screening of retinal anomalies.
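The thresholding strategy amounts to routing a prediction to manual review whenever the uncertainty score exceeds a chosen cut-off; a toy version is shown below. The threshold value is an assumption for illustration, not the one used by UIOS.

```python
import numpy as np

def route_prediction(class_probs: np.ndarray, uncertainty: float, threshold: float = 0.5):
    """Return the predicted class index when confident, otherwise flag for review."""
    if uncertainty > threshold:
        return "refer_for_manual_check"
    return int(np.argmax(class_probs))
```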


Subjects
Eye Abnormalities, Retinal Diseases, Humans, Artificial Intelligence, Algorithms, Uncertainty, Retina/diagnostic imaging, Fundus Oculi, Retinal Diseases/diagnostic imaging
15.
Med Image Anal; 90: 102938, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37806020

ABSTRACT

Glaucoma is a chronic neuro-degenerative condition that is one of the world's leading causes of irreversible but preventable blindness. The blindness is generally caused by a lack of timely detection and treatment, so early screening is essential for early treatment to preserve vision and maintain quality of life. Colour fundus photography and Optical Coherence Tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both imaging modalities have prominent biomarkers to indicate glaucoma suspects, such as the vertical cup-to-disc ratio (vCDR) on fundus images and retinal nerve fiber layer (RNFL) thickness on OCT volumes. In clinical practice, it is often recommended to perform both screenings for a more accurate and reliable diagnosis. However, although numerous algorithms have been proposed for automated glaucoma detection based on fundus images or OCT volumes, few methods leverage both modalities. To fill this research gap, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released a glaucoma-annotated dataset with both 2D colour fundus photography and 3D OCT volumes, which is the first multi-modality dataset for machine-learning-based glaucoma grading. In addition, an evaluation framework has been established to evaluate the performance of the submitted methods. During the challenge, 1272 results were submitted, and the ten best-performing teams were selected for the final stage. We analyse their results and summarize their methods in this paper. Since all the teams submitted their source code in the challenge, we conducted a detailed ablation study to verify the effectiveness of the particular modules proposed. Finally, we identify the proposed techniques and strategies that could be of practical value for the clinical diagnosis of glaucoma. As the first in-depth study of fundus & OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will serve as an essential guideline and benchmark for future research.
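For the fundus biomarker mentioned above, the vertical cup-to-disc ratio can be computed directly from optic cup and disc segmentation masks; a simple NumPy sketch follows (the binary-mask layout is an assumption).

```python
import numpy as np

def vertical_cup_to_disc_ratio(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    """cup_mask, disc_mask: binary (H, W) arrays. vCDR is the ratio of the
    vertical extents of the optic cup and the optic disc."""
    cup_rows = np.flatnonzero(cup_mask.any(axis=1))
    disc_rows = np.flatnonzero(disc_mask.any(axis=1))
    cup_height = cup_rows.max() - cup_rows.min() + 1 if cup_rows.size else 0
    disc_height = disc_rows.max() - disc_rows.min() + 1
    return cup_height / disc_height
```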


Subjects
Glaucoma, Humans, Glaucoma/diagnostic imaging, Retina, Fundus Oculi, Ophthalmological Diagnostic Techniques, Blindness, Optical Coherence Tomography/methods
16.
IEEE Trans Med Imaging; 42(12): 3871-3883, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37682644

ABSTRACT

Multiple instance learning (MIL)-based methods have become the mainstream for processing megapixel-sized whole slide images (WSIs) with pyramid structure in the field of digital pathology. Current MIL-based methods usually crop a large number of patches from the WSI at the highest magnification, resulting in a lot of redundancy in the input and feature space. Moreover, the spatial relations between patches cannot be sufficiently modeled, which may weaken the model's discriminative ability on fine-grained features. To address these limitations, we propose a Multi-scale Graph Transformer (MG-Trans) with information bottleneck for whole slide image classification. MG-Trans is composed of three modules: the patch anchoring module (PAM), the dynamic structure information learning module (SILM), and the multi-scale information bottleneck module (MIBM). Specifically, PAM utilizes the class attention map generated from the multi-head self-attention of a vision Transformer to identify and sample the informative patches. SILM explicitly introduces local tissue structure information into the Transformer block to sufficiently model the spatial relations between patches. MIBM effectively fuses the multi-scale patch features by utilizing the principle of the information bottleneck to generate a robust and compact bag-level representation. Besides, we also propose a semantic consistency loss to stabilize the training of the whole model. Extensive studies on three subtyping datasets and seven gene mutation detection datasets demonstrate the superiority of MG-Trans.
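The patch anchoring step can be pictured as keeping the patches with the highest class-attention scores; a generic top-k selection is sketched below, and PAM's actual sampling rule may differ.

```python
import torch

def select_informative_patches(patch_feats: torch.Tensor,
                               cls_attention: torch.Tensor, k: int = 256):
    """patch_feats: (N, D) patch embeddings; cls_attention: (N,) attention of the
    class token over patches. Returns the k highest-scoring patches and their indices."""
    k = min(k, cls_attention.numel())
    idx = torch.topk(cls_attention, k).indices
    return patch_feats[idx], idx
```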


Subjects
Computer-Assisted Image Processing, Semantics
17.
Med Image Anal; 90: 102945, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37703674

ABSTRACT

Fundus photography is prone to image quality degradation that impacts clinical examination performed by ophthalmologists or intelligent systems. Though enhancement algorithms have been developed to promote fundus observation on degraded images, high data demands and limited applicability hinder their clinical deployment. To circumvent this bottleneck, a generic fundus image enhancement network (GFE-Net) is developed in this study to robustly correct unknown fundus images without supervision or extra data. Leveraging image frequency information, self-supervised representation learning is conducted to learn robust structure-aware representations from degraded images. Then, with a seamless architecture that couples representation learning and image enhancement, GFE-Net can accurately correct fundus images while preserving retinal structures. Comprehensive experiments are implemented to demonstrate the effectiveness and advantages of GFE-Net. Compared with state-of-the-art algorithms, GFE-Net achieves superior performance in data dependency, enhancement performance, deployment efficiency, and scale generalizability. Follow-up fundus image analysis is also facilitated by GFE-Net, whose modules are respectively verified to be effective for image enhancement.
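As a rough picture of how image frequency information can be separated for structure-aware learning, the sketch below splits an image into low- and high-frequency parts with an ideal low-pass mask in the Fourier domain. GFE-Net's actual frequency handling may differ, and the cut-off radius is an assumption.

```python
import torch

def frequency_split(img: torch.Tensor, radius: int = 8):
    """img: (B, C, H, W) float tensor. Returns (low_freq, high_freq) components."""
    spectrum = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    H, W = img.shape[-2:]
    yy, xx = torch.meshgrid(torch.arange(H, device=img.device),
                            torch.arange(W, device=img.device), indexing="ij")
    # Ideal low-pass mask centred on the zero-frequency component.
    mask = (((yy - H // 2) ** 2 + (xx - W // 2) ** 2) <= radius ** 2).float()
    low = torch.fft.ifft2(torch.fft.ifftshift(spectrum * mask, dim=(-2, -1))).real
    return low, img - low
```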

18.
Br J Ophthalmol; 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37495263

ABSTRACT

BACKGROUND: The crystalline lens is a transparent structure of the eye that focuses light on the retina. It becomes cloudy, hard and dense with increasing age, which gradually causes the crystalline lens to lose its function. We aim to develop a nuclear age predictor to reflect the degeneration of the crystalline lens nucleus. METHODS: First, we trained and internally validated the nuclear age predictor with a deep-learning algorithm, using 12 904 anterior segment optical coherence tomography (AS-OCT) images from four diverse Asian and American cohorts: Zhongshan Ophthalmic Center with Machine0 (ZOM0), Tomey Corporation (TOMEY), University of California San Francisco and the Chinese University of Hong Kong. External testing was done on three independent datasets: Tokyo University (TU), ZOM1 and Shenzhen People's Hospital (SPH). We also demonstrate the possibility of detecting nuclear cataracts (NCs) from the nuclear age gap. FINDINGS: In the internal validation dataset, the nuclear age could be predicted with a mean absolute error (MAE) of 2.570 years (95% CI 1.886 to 2.863). Across the three external testing datasets, the algorithm achieved MAEs of 4.261 years (95% CI 3.391 to 5.094) in TU, 3.920 years (95% CI 3.332 to 4.637) in ZOM1-NonCata and 4.380 years (95% CI 3.730 to 5.061) in SPH-NonCata. The MAEs for NC eyes were 8.490 years (95% CI 7.219 to 9.766) in ZOM1-NC and 9.998 years (95% CI 5.673 to 14.642) in SPH-NC. The nuclear age gap outperformed both ophthalmologists in detecting NCs, with areas under the receiver operating characteristic curve of 0.853 (95% CI 0.787 to 0.917) in ZOM1 and 0.909 (95% CI 0.828 to 0.978) in SPH. INTERPRETATION: The nuclear age predictor shows good performance, validating the feasibility of using AS-OCT images as an effective screening tool for nucleus degeneration. Our work also demonstrates the potential use of the nuclear age gap to detect NCs.
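The nuclear age gap evaluation reduces to using the predicted age minus the chronological age as a detection score for NC and computing the AUROC; a minimal sketch with scikit-learn is given below (data loading and variable names are left as assumptions).

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def nuclear_age_gap_auc(predicted_age, chronological_age, has_nc):
    """has_nc: 1 for nuclear cataract eyes, 0 otherwise. The age gap itself
    is used directly as the detection score."""
    gap = (np.asarray(predicted_age, dtype=float)
           - np.asarray(chronological_age, dtype=float))
    return roc_auc_score(np.asarray(has_nc, dtype=int), gap)
```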

19.
Med Image Anal; 88: 102802, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37315483

ABSTRACT

Following their unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, compared with CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems, and outline promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development of this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.


Subjects
Chlorhexidine, Language, Humans, Computer Neural Networks
20.
IEEE J Biomed Health Inform; 27(7): 3349-3359, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37126623

ABSTRACT

Automated brain tumor segmentation is crucial for aiding brain disease diagnosis and evaluating disease progression. Magnetic resonance imaging (MRI) is a routinely adopted approach in the field of brain tumor segmentation that can provide images of different modalities. It is critical to leverage multi-modal images to boost brain tumor segmentation performance. Existing works commonly concentrate on generating a shared representation by fusing multi-modal data, while few methods take modality-specific characteristics into account. Besides, how to efficiently fuse arbitrary numbers of modalities remains a difficult task. In this study, we present a flexible fusion network (termed F2Net) for multi-modal brain tumor segmentation, which can flexibly fuse arbitrary numbers of multi-modal inputs to explore complementary information while maintaining the specific characteristics of each modality. Our F2Net is based on an encoder-decoder structure, which utilizes two Transformer-based feature learning streams and a cross-modal shared learning network to extract individual and shared feature representations. To effectively integrate knowledge from the multi-modality data, we propose a cross-modal feature-enhanced module (CFM) and a multi-modal collaboration module (MCM), which aim at fusing the multi-modal features into the shared learning network and incorporating the features from the encoders into the shared decoder, respectively. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our F2Net over other state-of-the-art segmentation methods.


Subjects
Brain Neoplasms, Humans, Brain Neoplasms/diagnostic imaging, Benchmarking, Electric Power Supplies, Knowledge, Computer-Assisted Image Processing